Two-step machine learning enables optimized nanoparticle synthesis



npj Computational Materials, Volume 7, Article number: 55 (2021)

In materials science, discovering nanomaterial formulations with targeted optical properties is expensive and time-consuming. Here we propose a two-step framework in which machine learning drives a high-throughput microfluidic platform to rapidly produce silver nanoparticles with a desired absorption spectrum. By combining Bayesian optimization (BO) based on Gaussian processes with a deep neural network (DNN), the framework converges to the target spectrum after sampling 120 conditions. Once the data set is large enough to train the DNN to sufficient accuracy in the target spectral region, the DNN is used to predict the palette of colors accessible with the synthesis. While remaining interpretable by humans, the proposed framework efficiently optimizes the nanomaterial synthesis and can extract fundamental knowledge of the relationship between chemical composition and optical properties, such as the effect of each reactant on the shape and amplitude of the absorption spectrum.

In recent years, machine learning (ML) methods have been applied to a variety of problems in materials science, such as drug discovery1,2, medical imaging3, materials synthesis4,5, functional molecule generation6,7 and material degradation8. Since the generation of experimental data in materials science is expensive and time-consuming, ML algorithms have mainly been developed on computational data or on experimental data sets collected from the literature, when available. However, the materials suggested by such ML algorithms can then turn out to be difficult or even impossible to synthesize. Recent developments in microfluidic high-throughput experimentation (HTE) platforms now allow large amounts of experimental synthesis data to be generated with small amounts of material10,11,12,13. Integrating an ML algorithm with these flow-chemistry platforms in a closed loop ensures that the algorithm only suggests materials that can actually be synthesized. Such attempts have been made for the synthesis of nanomaterials14,15. However, these studies were limited to optimization problems and focused on sparse data sets, whereas large data sets are needed to extract knowledge about how chemical composition and process parameters affect the final outcome.

In this article, we propose a two-step ML framework that can drive an HTE platform from the beginning of a screening campaign (sparse data set) to a well-sampled state (large data set), targeting predetermined optical properties without any prior knowledge of the complexity of the model, and extracting knowledge about how the chemistry affects the optical properties of the synthesized materials. Tuning wet-chemical nanoparticle syntheses is notoriously challenging owing to the inherent nonlinear competition between the nucleation of new particles and the growth of pre-existing seeds in solution17,18. The synthesis of silver nanoparticles (AgNPs) was therefore chosen to demonstrate the efficiency of the framework. AgNP synthesis is performed on a droplet-based microfluidic platform with five input variables, as shown in Figure 1 and detailed in "Methods". Owing to surface plasmon resonance, AgNPs have a characteristic optical fingerprint in the UV-Vis range that depends on their size and shape distribution. In this study, we chose as the optical target the theoretical absorption spectrum of a triangular nanoprism with an edge length of 50 nm and a height of 10 nm, calculated by plasmon resonance simulation using the discrete dipole approximation code DDSCAT.

The two-step optimization framework (blue box) consists of a first HTE loop (runs 2-5), in which BO (blue dots) samples the parameter space and trains the DNN, and a second loop (runs 6-8), in which the DNN (orange dots) is also allowed to sample the parameter space to validate its regression function. The conditions suggested by BO and DNN are tested on a droplet-based microfluidic platform. The absorption spectrum of each droplet is measured and compared with the target spectrum through a loss function before being fed to the BO, while the fully resolved absorption spectrum is provided to the DNN.

Conventional Bayesian optimization (BO) is usually chosen to drive HTE loops because it efficiently explores the parameter space and targets specific material properties, even when starting from a sparse data set5,19. However, BO gives no general insight into the reaction process. In addition, its performance depends on the initial choice of model hyperparameters and on the definition of the loss, usually a single-valued output parameter to which the measured material property is reduced. To extract knowledge from data, other studies have used neural networks to train regression models and perform inverse design from fixed data sets16,20,21. Although a neural network can learn complex functions, even from full spectra, it has many hyperparameters and requires a large training data set, which makes it difficult to integrate into machine-driven experimental loops with limited initial data and high evaluation costs, and makes it less efficient at exploring the parameter space in the early stages of sampling.

The proposed two-step framework (Figure 1) combines the optimization assets of BO with the regression capabilities of a deep neural network (DNN). In the first step, after a first experimental run of 15 conditions chosen by Latin hypercube (LH) sampling, the optimization process is initiated using batch-mode BO with local penalization (LP)22,23. The BO algorithm (see "Methods"), with a Gaussian process (GP) as surrogate model, explores the parameter space, whose boundaries are initially set by the experimenter, and searches for the chemical conditions that lead to the target spectrum. The definition of the loss function (see equation (7) in "Methods") takes into account both the shape and the intensity of the absorption spectrum. At each run, the BO algorithm selects the next batch of 15 conditions to be tested, balancing the minimization of the loss (exploitation) against the minimization of the uncertainty (exploration) through its decision strategy (acquisition function). In parallel, the experimental data generated by BO sampling are used to train the DNN offline. In the second step, starting from the sixth run, while BO keeps suggesting batches of 15 conditions with the same hyperparameters and keeps providing the DNN with more data around the target spectrum, the DNN is used to generate simulated spectra on a grid spanning the space of all process variables. The DNN can thereby suggest another 15 conditions that minimize the loss function, by ranking the predicted values on the grid. The DNN architecture and the grid optimization are described in "Methods". The conditions suggested by both the DNN and BO are tested on the HTE platform. In the subsequent runs, only the experimental data generated by BO sampling are used to train the DNN, so that the performance of BO and DNN can be compared directly. The ML-driven HTE loop stops when the target spectrum has been optimized by BO or DNN and the DNN regression is accurate and stable enough to extract chemical synthesis knowledge. The detailed data flow of the framework is given in Supplementary Figure 1.
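In pseudocode, the loop can be summarized as follows. This is a minimal sketch, not the authors' implementation: the helper names (latin_hypercube_sample, run_batch, bo_suggest, train_dnn, dnn_suggest, loss_fn) are hypothetical stand-ins for the steps described above.

```python
# Minimal sketch of the two-step loop; all helpers are hypothetical
# stand-ins for the steps described in the text (equation (7) loss,
# batch BO with GP + LP, offline DNN training, grid optimization).
bo_conditions = latin_hypercube_sample(bounds, n=15)     # run 1: LH sampling
bo_data = []                                             # only BO data trains the DNN
for run in range(1, 9):
    spectra = run_batch(bo_conditions)                   # HTE platform, 20 droplets/condition
    bo_data += list(zip(bo_conditions, spectra))
    if run >= 5:                                         # DNN suggestions enter from run 6
        dnn = train_dnn(bo_data)                         # offline training on BO data only
        dnn_conditions = dnn_suggest(dnn, loss_fn, batch_size=15)  # grid optimization
        run_batch(dnn_conditions)                        # tested, but not fed back to BO or DNN
    bo_conditions = bo_suggest(bo_data, loss_fn, batch_size=15)    # next BO batch (GP + LP)
```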

Here, we first demonstrate that the proposed two-step framework efficiently optimizes the synthesis of nanomaterials to obtain the desired plasmon resonance. The optimization performance is verified by TEM imaging of the synthesized AgNPs. Next, by extracting the BO and DNN regression functions, we show how the optimization process remains interpretable by humans. Finally, once the stability and accuracy of the DNN regression function are established, we use the DNN to extract fundamental knowledge about how the chemical composition and the spectral properties of the nanoparticles are related.

To evaluate the optimization performance of the framework, we tracked the evolution of the loss over successive experimental runs. Each run contains 15 chemical conditions. For each condition, the spectra of 20 replicate droplets are recorded and used to update the algorithms, and the median loss over the 20 replicas is computed to cope with outliers. Since the median loss is used to update the BO, we define the best performer of a run as the condition with the lowest median loss among all conditions of that run. In Figure 2a, we report the loss value obtained for each replica and the statistical distribution of the replicas for the best-performing conditions. In the first step of the framework, the median loss of the BO best performer drops rapidly over the first runs and then reaches a plateau from run 4.
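For illustration, the per-condition statistic can be computed as below, assuming a helper loss_fn implementing the loss of equation (7):

```python
import numpy as np

def condition_loss(replica_spectra, target, loss_fn):
    """Median loss over the 20 droplet replicas of one condition (robust to outliers)."""
    losses = [loss_fn(spectrum, target) for spectrum in replica_spectra]
    return float(np.median(losses))
```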

a Evolution of the loss for the conditions suggested by RS (green), BO (blue), and DNN (orange): each point represents one droplet. For each run, the condition giving the lowest median loss is identified as the best performer. The box plots show the distribution of loss values over all droplet replicas of the run, and the gray area represents the minimum loss value to be reached before stopping the HTE loop. The absorption spectra of the best performers of BO (b) and DNN (c), together with the relative size distributions of the triangular prisms in solution, show that the chosen loss function allows the spectra to converge to the target spectrum and the prisms to converge to a triangle edge length of 65 nm. The color of each absorption spectrum indicates the color of the best-performing droplet. The TEM images show nanoprisms with the median edge size found in the sample. The scale bars correspond to 50 nm.

The best conditions suggested by BO accumulate at the boundary of the parameter space (low Qseed, high \(Q_{\mathrm{AgNO}_{3}}\)), indicating that better conditions might be found beyond the boundaries of the parameter space. Active learning campaigns traditionally fix the boundaries of the parameter space, which requires prior expertise in the chemical synthesis. In our study, we face the reality of an experimental campaign exploring a parameter space that has never been explored before, and the flexibility to expand the parameter space is essential for approaching the target spectrum. The framework can expand the parameter space if the suggested conditions lie too close to a set boundary for two consecutive runs. This happened at runs 4 and 5. Therefore, starting from run 6, the flow-ratio restrictions were relaxed to the maximum allowed by the device. No preliminary screening of the extended parameter space was performed: both BO and the DNN-based grid optimization use their knowledge of the initial parameter space to start exploring the expanded space. This extension allowed a further decrease of the median loss of the best performers obtained by BO sampling. DNN sampling was introduced from run 6 onwards. Interestingly, the median loss of the best performer obtained by the DNN at run 8 is significantly lower than that of BO (see Figure 2a).
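The expansion rule can be sketched as follows; the exact criterion is not specified in the text, so the closeness tolerance tol and the vectorized form are assumptions:

```python
import numpy as np

def maybe_expand(bounds, last_two_batches, hard_limits, tol=0.05):
    """Relax an upper bound to the device limit if the suggested conditions
    hug it (within `tol` of the variable's range) in two consecutive runs."""
    lo, hi = np.asarray(bounds, dtype=float).T          # bounds: (n_vars, 2)
    span = hi - lo
    near = [((hi - np.asarray(batch)) / span < tol).any(axis=0)
            for batch in last_two_batches]              # per-variable flags, per run
    expand = near[0] & near[1]                          # near the bound in both runs
    hi = np.where(expand, np.asarray(hard_limits, dtype=float)[:, 1], hi)
    return np.stack([lo, hi], axis=1)
```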

To justify our two-step approach, we examined the optimization performance of the DNN when it was left to sample the space by itself using grid search, instead of using the BO-sampled conditions. Starting from the same set of initial conditions as in run 1, we show that the DNN cannot find low-loss conditions as quickly as BO (see Supplementary Discussion 2 for more details). In addition, we demonstrate the non-triviality of the BO approach in the first step of the framework by comparing its performance with random sampling (RS) (see Figure 2a): in the second run, RS happens to perform better than BO, but from the third run onwards BO outperforms RS and clearly begins to converge to lower loss values, while RS still selects high-loss conditions.

The convergence of the absorption spectra of the BO and DNN best performers towards the target spectrum further validates the optimization. For the BO samples (Figure 2b), the main absorbance peak quickly shifts to reach the target value (645 nm), while the absorbance below 600 nm decreases. The evolution of the measured spectra towards the target spectrum confirms the efficiency of the loss function defined in this study. Figure 2c reports the measured spectrum of the DNN top performer at each run, together with the spectrum predicted by the DNN before sampling. Although the spectrum predicted at run 6 is noisy because of insufficient training in the newly accessible region, the spectral predictions become smoother over the following runs, even though the DNN is still extrapolating from the sampled conditions.

The best-performing samples were imaged by TEM (see "TEM imaging and analysis"). Two nanoparticle shapes are mainly synthesized: nanospheres and triangular nanoprisms, both with a broad size heterogeneity. Across the runs, the percentage of triangles remains around 30%. Moreover, for the best performers of both BO and DNN, the edge length of the triangular nanoprisms (Fig. 2b, c) and the diameter of the nanospheres (see Supplementary Fig. 3) increase from run to run. The shift of the absorbance spectra over the runs goes along with a shift of the size distribution of the synthesized triangular nanoprisms towards the desired triangle edge length. The size distributions of the triangular AgNPs synthesized by BO (Figure 2b) and DNN (Figure 2c) become narrower with the runs, and the triangle edge converges to 65 nm, which is 30% larger than the nanoparticle size used to simulate the target spectrum. This shift can be explained by an increased thickness of the triangular prisms: from TEM measurements, the prism thickness is estimated to be about 13 nm. Indeed, the position of the absorption peak is determined by the aspect ratio between the prism edge length and its thickness.

It is worth noting that the approach developed in this study uses the full optical absorption spectrum. In previous studies14,15, the optimization was performed using only a few properties of the absorption spectrum, such as the peak wavelength, the full width at half maximum (FWHM), and the peak intensity. Although using a limited set of spectral features may be a good choice for fast optimization problems, it reduces the spectral information available to train the DNN for the regression problem. Since the shape and size heterogeneity of the AgNPs leads to a superposition of absorption peaks, the entire spectrum carries information about the full size and shape distribution of the nanoparticles. While reducing the one-dimensional spectrum to a single loss value keeps BO efficient, the DNN can exploit the full spectral resolution to achieve effective optimization and to predict AgNP colors accurately. This full-spectrum approach has two additional advantages. First, it does not require the DNN architecture to be adapted to the number of detected features, in case this number changes during the screening of the parameter space. Second, it makes the framework robust to changes of the spectral target during the optimization process, because the loss function does not need to be adapted to the absorbance target.

To understand how BO makes its decisions over the successive runs, we computed the Pearson correlation matrix. Supplementary Figure 4 shows the correlation coefficients of the shape and amplitude components of the spectrum with the total loss. From run 1 to run 5, the shape has a higher correlation coefficient (-0.93) than the amplitude (-0.55); hence, in the initial parameter space, the spectral shape is optimized in priority over the spectral amplitude. From run 6 to run 8, however, the correlation coefficient of the shape (-0.63) becomes smaller in magnitude than that of the amplitude (-0.95), indicating that the amplitude of the absorption spectrum is mainly optimized during the second step of the framework.
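Once the per-condition losses are tabulated, the analysis is a one-liner; the column names below are assumptions about how the data is stored:

```python
import pandas as pd

# shape_loss, amplitude_loss, total_loss: one value per sampled condition
df = pd.DataFrame({"shape_loss": shape_loss,
                   "amplitude_loss": amplitude_loss,
                   "total_loss": total_loss})
print(df.corr(method="pearson")["total_loss"])  # Pearson coefficients vs. total loss
```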

In the following, we investigate why the DNN shows good optimization performance in the second step, while the data set remains sparse in the extended parameter space. In the second step of the framework, both algorithms improve their extrapolation accuracy. Since the DNN is trained only on the data obtained by BO sampling, we can compare, at each run, its surrogate function with the BO surrogate function obtained through the GP. Using SHAP (SHapley Additive exPlanations), the process variables can be ranked by importance: \(Q_{\mathrm{AgNO}_{3}}\) and Qseed are found to be the most important, followed by QTSC, Qtotal and QPVA (see Supplementary Figure 5). The {\(Q_{\mathrm{AgNO}_{3}}\), Qseed} space is therefore selected to project the minimum loss obtained by the regression functions over the other three process variables. Figure 3 shows the minimum-loss predictions obtained at the end of run 8 for three different functions: a Gaussian fit of the raw experimental data, the BO regression function, and the DNN regression function. The conditions suggested by BO and DNN both converge to similar regions of the {\(Q_{\mathrm{AgNO}_{3}}\), Qseed} space (Figure 3a), and the positions of the global minima of the two algorithms are similar. However, the BO regression function is found to have fewer features than the DNN one (Figure 3b, c). In Supplementary Movie 1, we report the evolution of the BO regression function from run 1 to 8 and the positions of the conditions suggested for the next run. The suggested conditions do not cluster, as expected from the jitter value chosen for BO to balance exploration and exploitation. In addition, owing to the sudden expansion of the parameter space, experimental points are lacking between the global minimum and a second minimum, which can explain the local minimum observed in the projection of the BO regression function in Figure 3b. This second minimum is not observed in the DNN regression, confirming the ability of the DNN to better fit the parameter space.

2D maps of the minimum loss in the {\(Q_{\mathrm{AgNO}_{3}}\), Qseed} space obtained from (a) the raw experimental data, (b) the BO regression and (c) the DNN regression. The experimental conditions are represented by blue discs (conditions suggested by BO over runs 1 to 8) and orange discs (conditions suggested by the DNN over runs 6 to 8). The asterisks mark the conditions of the best loss performers obtained by BO and DNN. The color corresponds to the loss value indicated on the color bar.

To further understand why the DNN outperforms BO here, we examined the minimum-loss predictions of the BO surrogate and the DNN in the {QTSC, Qtotal} space, over the other three dimensions. Fringe-like artifacts appear in the BO projection, while the DNN behaves correctly in the same subspace (see Supplementary Figure 6). BO fails to fit the Qtotal dimension properly because the resolution of this dimension is ten times that of the other dimensions, the parameters not being normalized before BO training. Normalizing the parameters of the BO surrogate leads to a better projection in the {QTSC, Qtotal} space, but we chose not to normalize them in order to retain more flexibility when expanding the parameter space. Although this lack of normalization strongly affects BO performance, the DNN performs well along the Qtotal dimension.

The complexity of the relationship between chemical composition and optical properties can be probed by principal component analysis (PCA). Neither linear nor kernel PCA was found to help reduce the parameter space (see Supplementary Discussion 3), which indicates a complex nonlinear relationship between the chemical parameters and the spectra. Some information can nevertheless be extracted from the SHAP analysis: we observe that high \(Q_{\mathrm{AgNO}_{3}}\), low Qseed, low QTSC and high Qtotal values are negatively correlated with the final loss. This provides information not only for the future design of the experimental device, but also about the region of the space where the target spectrum can be reached.

The correlation matrix also helps to understand how the flow-ratio quotients \(Q_{\mathrm{AgNO}_{3}}\)/Qseed and QTSC/\(Q_{\mathrm{AgNO}_{3}}\) affect the resulting spectra (see Supplementary Figure 4). While the ratio between the silver nitrate and silver seed flows, and therefore between their concentrations in the droplet, mostly affects the spectral amplitude, the ratio between the trisodium citrate and silver nitrate concentrations has a stronger effect on the shape of the absorption spectrum. This extracted insight is valuable and is consistent with previous literature on the effect of trisodium citrate on the anisotropic growth of AgNPs.

To further determine the palette of colors achievable with this chemical process and to establish a map of the accessible colors, we use the trained DNN to generate spectra over the parameter space. Before extracting any information from the DNN, the accuracy and stability of its regression should be quantified. While neural networks are usually trained on a fixed data set, typically split into training and validation sets, the two-step framework integrates the DNN into the HTE loop and extends the data set at every run. The DNN is trained online on the data previously sampled by BO, and the validation step is performed on the data selected by the grid optimization for the subsequent experimental run. The accuracy of the DNN can therefore be studied by comparing the absorption spectra predicted by the DNN with the spectra measured at the next run. One way to assess the prediction accuracy of the DNN qualitatively is to report the cosine similarity between the measured spectrum and the target spectrum as a function of the cosine similarity between the DNN-predicted spectrum and the target spectrum (Figure 4a). The data points cluster around the diagonal, which means that the DNN predictions are as close to the target as the measured spectra. The accuracy can also be estimated quantitatively for each condition in two different ways: the cosine similarity between the predicted and measured spectra quantifies the accuracy on the shape of the absorption spectrum, while the mean squared error (MSE) estimates the error on the absorbance amplitude at each wavelength. Figure 4b shows that from run 6 to run 8 the cosine similarity improves slightly, while the average MSE decreases more significantly: the DNN predictions become increasingly accurate in terms of spectral amplitude and spectral noise. At run 8, the MSE between the predicted and measured spectra fell below our target value, arbitrarily fixed at 0.02, and the HTE loop was stopped.
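Both metrics are straightforward to compute; a minimal sketch:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Shape agreement between two spectra, independent of amplitude."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mse(a: np.ndarray, b: np.ndarray) -> float:
    """Amplitude error per wavelength between two spectra."""
    return float(np.mean((a - b) ** 2))

# Stopping criterion of the HTE loop: mse(predicted, measured) < 0.02
```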

a Measured cosine similarity (between the target spectrum and the experimental spectrum) versus predicted cosine similarity (between the target spectrum and the DNN-predicted spectrum) for runs 6, 7 and 8. All data not used to train the DNN are used in this validation step: all DNN data, the BO data of the last run, and the random sampling data. Error bars represent the standard deviation of the cosine similarity measured over the droplet replicas of a condition. The alignment of the data along the diagonal shows that the DNN correctly predicts the shape of the spectra. b Evolution of the cosine similarity and mean squared error (MSE) between the spectra predicted by the DNN and the spectra measured at each run, for the conditions suggested by BO and DNN at that run. The box plots show the distribution of these values over all droplet replicas of the run.

The stability of the DNN from run to run can be studied by tracking the evolution of its regression function in the {\(Q_{\mathrm{AgNO}_{3}}\), Qseed} space from run 1 to run 8 (Supplementary Figure 7). While the BO projection changes gradually, except when the parameter space is expanded (run 5), the DNN mapping changes drastically over the first 5 runs, which shows that the DNN is unstable when the training data set is small. The stability of the DNN regression function is evaluated by measuring the cosine similarity between two consecutive runs of the DNN on the {\(Q_{\mathrm{AgNO}_{3}}\), Qseed} plane of the parameter space, with QTSC, Qtotal and QPVA fixed at the best-performing condition of run 8 (see Supplementary Figure 8). Both in the initial and in the expanded space, we observe a significant increase of the stability over the runs.

Once the stability and accuracy of the DNN were established, all the data generated during the experimental runs were used to train a final DNN. This DNN surrogate model is used to generate the spectra over the entire parameter space. A piece of software was developed to navigate continuously through the parameter space and display the predicted absorption spectrum for any given condition (see Supplementary Movie 2). Moreover, using the CIE 1931 color space, each absorption spectrum can be converted into the color that a human eye would perceive when observing the droplet of synthesized nanoparticles. Figure 5 shows the colors predicted by the DNN surrogate model on the {\(Q_{\mathrm{AgNO}_{3}}\), Qseed} plane of the parameter space. For four different regions of the parameter space, the absorption spectra predicted by the DNN are compared with experimental spectra obtained at similar conditions, illustrating the relevance of this representation. The diversity of the obtained colors reflects the complex relationship between the absorption spectrum and the chemical composition of the droplets.
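A hedged sketch of such a conversion, using the colour-science package, is given below; the authors' exact pipeline (absorbance-to-transmittance conversion, illuminant, normalization) is not specified in the text and is assumed here:

```python
import numpy as np
import colour  # colour-science package

def absorbance_to_srgb(wavelengths: np.ndarray, absorbance: np.ndarray) -> np.ndarray:
    """Approximate the perceived color of a droplet from its absorption spectrum."""
    transmittance = 10.0 ** (-absorbance)               # Beer-Lambert: T = 10^(-A)
    sd = colour.SpectralDistribution(dict(zip(wavelengths, transmittance)))
    xyz = colour.sd_to_XYZ(sd) / 100.0                  # CIE 1931 tristimulus values
    return np.clip(colour.XYZ_to_sRGB(xyz), 0.0, 1.0)   # displayable sRGB triplet
```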

Color map of the AgNPs predicted by the final DNN in the {\(Q_{\mathrm{AgNO}_{3}}\), Qseed} space, for fixed values QPVA = 16%, QTSC = 6.5% and Qtotal = 850 µL/min, corresponding to the best-performing condition of the DNN at run 8. The predicted spectra (gray dashed lines) are extracted at four regions of the space (A, B, C, and D) and compared with experimental spectra obtained at similar conditions (solid color lines).

Recently, numerous studies on active-learning-guided synthesis have demonstrated the ability of BO to explore large parameter spaces with continuous and/or discrete variables14,15,26 and to accelerate materials discovery. Using an HTE platform similar to the one in this study, Bezinge et al.14 added a kriging-based algorithm to the loop to optimize specific optical properties, and managed to extract some knowledge about the spectral amplitude and the FWHM at the target emission wavelength. However, their method used a large data set of 40 experimental points in a 3D parameter space to initialize the ML algorithm, far more than ten times the dimensionality of the parameter space. In another example of the recent successful optimization of metal halide perovskite synthesis (ref. 27), a large initial experimental data set of more than 8000 conditions obtained by RS was used to train support vector machines and neural networks. How much experimental data is needed to accurately train a DNN, or any other regressor, remains an open question in ML. Since the complexity of the experimental parameter space is unknown at the beginning of an experimental campaign, the amount of experimental data required to train the regressor accurately cannot be predicted. Our work proposes an experimental way to bypass this initialization problem, by using BO to keep feeding data to the DNN for as long as the DNN is not sufficiently trained to perform the inverse design.

Only one experimental run is needed to start the proposed two-step algorithm. The batch size of the experimental runs was chosen to minimize the experimental costs: increasing the batch size to 15 significantly reduces the time and the solvents required for the cleaning procedure of our HTE platform (see Supplementary Discussion 4). To avoid the clustering of data points within a batch, we chose LP in the acquisition step, because it iteratively penalizes the neighborhood of the points already selected by the acquisition function. Gonzalez et al.22 demonstrated that LP performs better than extensive baselines in batch BO. With such an acquisition function, the computational cost of evaluating the loss over the entire parameter space is similar for a batch of 15 points and for a batch of a single data point. By choosing an appropriate batch size, the proposed two-step method can thus be applied to other HTE platforms.

The neural network was chosen for its regression capabilities once the data set is large enough. Neural networks have shown their potential to help extract fundamental knowledge from the literature28 and from experimental data29. Umehara et al.29 demonstrated the ability of CNN gradient analysis to extract optimization insights and composition-property relationships in a large, high-dimensional parameter space. Our workflow attempts to combine this DNN capability with an HTE platform that rapidly samples the parameter space, in order to extract knowledge within the loop. Epps et al.30 recently proposed a related approach integrating an ensemble of 500 neural networks into an HTE loop, which allows the DNN hyperparameters to be tested during the HTE campaign. Its efficiency was demonstrated for a simple DNN architecture (3 output parameters, 2 layers, and up to 25 nodes). However, using such an ensemble of neural networks to predict the full absorption spectrum would significantly increase the computation time. The solution we propose consists of a single DNN with an architecture adapted to the full spectral resolution, aiming at a good compromise between the computational cost of finding good hyperparameters and the regression accuracy.

In this study, the simulated target spectrum was not expected to be reachable, because of the polydispersity of the nanoparticles: we deliberately chose a chemical route that is not optimal for producing monodisperse nanoparticles. A great deal of information is therefore hidden in the full absorption spectrum, and we chose to train our algorithms on the entire spectrum.

By training the two algorithms on the absorption spectrum of each droplet replica separately, the uncertainty of the experimental data is introduced into both the BO and the DNN algorithms. The surrogate model used for BO is a GP, which produces a probabilistic estimate of the loss value. The DNN, although trained on all droplet replicas, has no such probabilistic formulation.

In the proposed two-step method, BO was chosen for its efficiency on sparse data sets and for its simplicity, as it requires almost no tuning of hyperparameters. A GP is not necessarily the best choice for learning complex systems30, but in our case it enables fast optimization, while the DNN takes charge of learning the complexity of the parameter space.

Other methods have recently been proposed for similar optimization and regression problems. Williams et al.31 conducted a study similar to this work, in which regression algorithms were trained on a full-factorial grid search and different regressors were ranked by accuracy; in their case, a random forest performed best. For the chemical synthesis considered in our study, sampling the space with BO appears to make the DNN more accurate at predicting the location of the lowest loss than a pure grid-search approach (see Supplementary Discussion 2). Another study, by Häse et al.32, reported the efficiency of Bayesian neural networks (BNN) for optimizing nonlinear chemical reaction networks; however, BNNs usually require a larger data set than GPs to become stable. Other studies have shown that BO with adaptive kernels may resolve finer regression features33. Most previous work, however, has focused either on optimization, which lacks interpretability and transferability when the target changes, or on inverse design using regression on static data sets20.

Since each parameter space has its own complexity, no single algorithm choice is optimal for all experimental systems. However, once the two-step framework is implemented, the experimental data can be used to compare different types of regressors and acquisition functions, and to determine which combination performs best in terms of locating the best performer and of regression accuracy. Changing the acquisition function in the second step of the framework should also be considered, especially when approaching the target requires a higher regression accuracy. Algorithm selection using information criteria (such as the Akaike information criterion34 and the Bayesian information criterion35) could also be used to maximize the time and resource efficiency of closed-loop laboratories, for example through co-evolution, physics fusion, and related strategies36,37.

In summary, we have shown the performance of a two-step framework that couples BO and a DNN with an HTE platform in a loop to optimize the synthesis of silver nanoprisms. After several runs of targeted sampling with the BO algorithm, the DNN, trained offline, is introduced to speed up the optimization process. By tracking the evolution of the loss and of the regression functions along the runs, we can determine at which run the DNN starts to predict the region around the target location in the parameter space better. The process remains interpretable and knowledge can be extracted. The feature importances show that, even though every parameter plays a role, silver nitrate and silver seeds are the most influential parameters for targeting silver nanoprisms. The correlation matrix provides information on how the parameters and their ratios affect the shape and the amplitude of the absorption spectrum. In addition, the absorption spectra around the target can be predicted, giving access to the sensitivity of the optical properties of the synthesized nanomaterials to the process parameters. The framework also yields a transferable algorithm: the final trained DNN can be used to optimize the synthesis towards different targets, and it can be used for inverse design of nanoparticles with optical properties different from our original target.

The developed method is generally applicable to the synthesis of other materials in an HTE loop and can be adapted to other types of HTE platforms. The set of experimental data collected during this study can be used in further work to determine the performance of other acquisition functions and regressors within this two-step framework, and to compare it with single-step frameworks such as BNNs.

BO has many advantages that make it suitable for sampling the parameter space at the start of a campaign. Since the response surface between the process variables and the target loss is unknown, the optimization of the process variables can be treated as the optimization of a black-box function. BO has been shown to outperform other global optimization methods on various benchmark functions38. The parameter space being continuous, a GP was chosen as the surrogate model for BO. An important aspect of defining the GP model is the kernel and its associated hyperparameters, which control the shape of the regression function39, i.e. the fit of the response surface between the process variables and the target loss.

We chose BO with a GP surrogate model for the following reasons. First, the implementation of BO with a GP is not very sensitive to the initial choice of the algorithm's hyperparameters39. The functional relationship f between the process parameters and the absorbance spectra is expensive to evaluate and may be noisy, which rules out most exhaustive search methods, such as grid sampling and RS. Ref. 40 shows that, compared with a genetic algorithm, BO requires a smaller initial data set and fewer iterations to reach the optimum. The experiment involves five process variables, a regime in which BO performs best41. Several surrogate models can be used for BO, such as GPs, tree-based algorithms, and neural networks. Because the process variables are continuous, tree-based algorithms were not used in this study. The GP was chosen because its number of hyperparameters is much smaller than that of a NN; in addition, the uncertainty of the GP fit is known, which makes the trade-off between exploration and exploitation straightforward. Driven by these considerations, the combination of BO and GP was used to actively sample the chemical space.

The GP is defined in equation (1), which can be expressed as \(f({\mathbf{X}}) \sim {\mathrm{GP}}(m({\mathbf{X}}), k({\mathbf{X}}, {\mathbf{X}}^{\prime}))\), where \({\mathbf{X}} = \{x_1, \ldots, x_n\}\) is the vector of process variables, m(X) is the mean function, and k(X, X′) is the covariance matrix between all possible (X, X′) pairs. We use the Matérn 5/2 kernel in the covariance matrix, and we implement batch BO with LP22 to suggest batches of 15 data points, consistent with the experimental settings.

We use expected improvement (EI) as the acquisition function to select the next experimental conditions as a trade-off between exploration and exploitation:

$$\mathrm{EI}\left( {\mathbf{X}} \right) = {\mathbb{E}}\left[ {\max \left( {f\left( {{\mathbf{X}}^ + } \right) - f\left( {\mathbf{X}} \right),0} \right)} \right]$$

where \(f\left( {{\mathbf{X}}^ + } \right)\) is the best (lowest) loss value among the samples observed so far and \({\mathbf{X}}^ +\) its location in the parameter space.

The point suggested for the next experiment is the one that maximizes the expected improvement. EI can be expressed analytically as:

$$\mathrm{EI}\left( {\mathbf{X}} \right) = \left\{ {\begin{array}{*{20}{l}} {\left( {f\left( {{\mathbf{X}}^ + } \right) - \mu \left( {\mathbf{X}} \right) - \xi } \right){\Phi}\left( Z \right) + \sigma \left( {\mathbf{X}} \right)\varphi \left( Z \right)} & {{\mathrm{if}}\;\sigma \left( {\mathbf{X}} \right) > 0} \\ 0 & {{\mathrm{if}}\;\sigma \left( {\mathbf{X}} \right) = 0} \end{array}} \right.,\quad Z = \frac{{f\left( {{\mathbf{X}}^ + } \right) - \mu \left( {\mathbf{X}} \right) - \xi }}{{\sigma \left( {\mathbf{X}} \right)}}$$

where \(\mu \left( {\mathbf{X}} \right)\) and \(\sigma \left( {\mathbf{X}} \right)\) are the posterior mean and standard deviation of the GP at X, \({\Phi}\) and \(\varphi\) are the cumulative distribution function and probability density function of the standard normal distribution, and \(\xi\) is the jitter value that sets the exploration-exploitation trade-off: the higher \(\xi\), the more exploratory the BO. In this study, we fixed the jitter value to 0.1.
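A minimal sketch of this setup with GPyOpt (ref. 23) is shown below; the domain bounds are placeholders for the initial flow-ratio limits, and X_sampled/median_losses stand for the conditions tested so far and their measured median losses:

```python
import GPy
import GPyOpt

domain = [  # placeholder bounds for the initial parameter space
    {"name": "Q_AgNO3", "type": "continuous", "domain": (4, 20)},
    {"name": "Q_seed",  "type": "continuous", "domain": (4, 20)},
    {"name": "Q_TSC",   "type": "continuous", "domain": (4, 20)},
    {"name": "Q_PVA",   "type": "continuous", "domain": (10, 40)},
    {"name": "Q_total", "type": "continuous", "domain": (200, 1000)},
]

bo = GPyOpt.methods.BayesianOptimization(
    f=None,                           # loss values come from the experiments
    domain=domain,
    X=X_sampled, Y=median_losses,     # conditions tested so far, median losses
    model_type="GP",
    kernel=GPy.kern.Matern52(input_dim=5),
    acquisition_type="EI",
    acquisition_jitter=0.1,           # exploration-exploitation jitter (xi)
    evaluator_type="local_penalization",
    batch_size=15,
)
next_batch = bo.suggest_next_locations()  # 15 conditions for the next run
```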

Supplementary Figure 2 shows the architecture of the neural network used in this work, chosen to capture the complexity of the system while keeping the computing time reasonable. The input layer consists of five nodes, followed by four hidden layers (50, 100, 200 and 500 nodes). The output layer consists of 421 nodes, corresponding to the UV-Vis spectral data points. As this is exploratory work with little prior knowledge of the parameter space, we chose ReLU for all activation functions to ease convergence, and MSE as the cost function. The weights and biases are updated at each run of the HTE loop. The number of nodes of the initial hidden layer is determined by equation (6), investigated by Stathakis42, where m is the number of output nodes and N is the number of data points; in this work, m is 421 and N is set by the number of data points of each run (approximately 300). Goodfellow et al.43 showed empirically that using deep, multi-layer networks is a good heuristic for configuring networks for challenging, complex predictive modeling problems.
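A minimal Keras sketch of this architecture is given below; the optimizer and training schedule are not specified in the text and are assumptions:

```python
# DNN sketch: 5 process variables -> 4 hidden ReLU layers (50/100/200/500)
# -> 421 spectral outputs, trained with an MSE cost function.
import tensorflow as tf

def build_dnn() -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(50, activation="relu", input_shape=(5,)),
        tf.keras.layers.Dense(100, activation="relu"),
        tf.keras.layers.Dense(200, activation="relu"),
        tf.keras.layers.Dense(500, activation="relu"),
        tf.keras.layers.Dense(421),   # UV-Vis absorbance at 421 wavelengths
    ])
    model.compile(optimizer="adam", loss="mse")  # optimizer is an assumption
    return model

# model = build_dnn()
# model.fit(X_train, spectra_train, epochs=500, verbose=0)  # retrained each run
```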

Since the initial DNN is trained with very few representative data points, the learned function is not well trained. We perform a grid search over the entire parameter space to select the best formulations, i.e. those with the smallest predicted loss, the grid search acting as a regularization step in the optimization process. The DNN joined the optimization process from run 6. DNN 6 is built according to the specifications above and trained on the experimental data of the BO-suggested runs 1 to 5. DNN 6 is then used as a mapping function between the five process variables and the corresponding UV-Vis spectra. A grid search is performed in the 5D parameter space to generate the spectrum corresponding to each grid point: \(Q_{\mathrm{AgNO}_{3}}\), QTSC and Qseed in the range [0.5:80]% with a 5% step, QPVA in the range [10:40]% with a 5% step, and Qtotal in the range [200:1000] µL/min with a 100 µL/min step. The loss of each spectrum is then computed according to our loss function and sorted in ascending order. DNN 6 selects the 15 combinations of the five variables with the 15 smallest losses and suggests them to the experimenter for synthesis and collection of the actual spectra. Similarly, DNN 7 is trained and grid-optimized on the data of BO runs 1 to 6, and DNN 8 on the data of BO runs 1 to 7.
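The grid optimization step can be sketched as follows, assuming model is the trained DNN (as in the sketch above) and loss_fn implements the loss of equation (7) against the fixed target spectrum:

```python
import itertools
import numpy as np

q_chem = np.arange(0.5, 80.0, 5.0)          # Q_AgNO3, Q_TSC, Q_seed (%), 5% steps
q_pva = np.arange(10.0, 45.0, 5.0)          # Q_PVA (%), 5% steps
q_total = np.arange(200.0, 1100.0, 100.0)   # Q_total (uL/min), 100 uL/min steps

grid = np.array(list(itertools.product(q_chem, q_chem, q_chem, q_pva, q_total)))
spectra = model.predict(grid)               # one 421-point spectrum per grid point
losses = np.array([loss_fn(s) for s in spectra])
next_batch = grid[np.argsort(losses)[:15]]  # the 15 smallest predicted losses
```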

The loss function is defined as:

where \(A_{{\mathrm{measured}}}^{{\mathrm{max}}}\) is the maximum value of the measured absorption spectrum. The cosine similarity between the measured spectrum and the target spectrum quantifies the similarity of the two spectral shapes; using it in the definition of the loss function therefore drives the optimization of the shape of the absorption spectrum. However, to avoid saturated or noisy optical measurements, the conditions suggested by BO and the DNN should remain within the detection range of the UV-Vis spectrometer. The amplitude function \(\delta\) is designed for this purpose: it forces BO and the DNN to suggest conditions whose maximum absorbance is above half of the detection limit of the spectrometer.
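Equation (7) is not reproduced here; a plausible minimal form consistent with this description is sketched below, the exact combination of the shape and amplitude terms being an assumption:

```python
import numpy as np

def loss_fn(measured: np.ndarray, target: np.ndarray,
            detection_limit: float = 1.0) -> float:
    """Shape term via cosine similarity plus an amplitude penalty delta.
    The exact functional form of equation (7) may differ; this is a sketch."""
    cos_sim = np.dot(measured, target) / (
        np.linalg.norm(measured) * np.linalg.norm(target))
    a_max = float(measured.max())
    # delta penalizes spectra whose peak falls below half the detection limit
    delta = max(0.0, detection_limit / 2.0 - a_max)
    return (1.0 - cos_sim) + delta
```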

To quantify the most important process parameters and their impact on model performance, we evaluated BO and the DNN with the SHAP (SHapley Additive exPlanations) algorithm. SHAP is a game-theoretic method that provides an additive feature-importance metric for any ML model. Consider the prediction task at a single data point of the data set: the gain is the actual prediction at that point minus the average prediction over all data points in the data set, and all the feature values of the data point are assumed to contribute collectively to this gain. In this work, the feature values (\(Q_{\mathrm{AgNO}_{3}}\), QPVA, QTSC, Qseed, and Qtotal) contribute together to the predicted loss value. Our goal is to explain the difference between the actual prediction and the average prediction over the entire data set. Specifically, this difference is distributed among the five features, and the share assigned to each feature represents its importance at the given data point.
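A minimal sketch with the shap package is given below; predict_loss (a model mapping the five flow ratios to a predicted loss) and the array of sampled conditions X are assumed, and KernelExplainer is one model-agnostic choice among several:

```python
import numpy as np
import shap

feature_names = ["Q_AgNO3", "Q_PVA", "Q_TSC", "Q_seed", "Q_total"]
# predict_loss: callable taking an (n, 5) array and returning n loss values
background = shap.sample(X, 50)               # background set for the expectation
explainer = shap.KernelExplainer(predict_loss, background)
shap_values = explainer.shap_values(X)        # per-feature attributions

# Global importance: mean absolute SHAP value per feature.
importance = np.abs(shap_values).mean(axis=0)
for name, score in sorted(zip(feature_names, importance), key=lambda t: -t[1]):
    print(f"{name}: {score:.4f}")
```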

Silver nitrate (99.9%) was purchased from Strem Chemicals Inc. Silver seeds (10 nm, 0.02 mg/mL in aqueous buffer, stabilized with sodium citrate), trisodium citrate dihydrate (TSC) (≥99.0%, ACS) and polyvinyl alcohol (PVA) (Mowiol 8-88, Mw ~67,000) were purchased from Sigma Aldrich. The silver seeds were used as received. L-(+)-Ascorbic acid (AA) (99%) was purchased from Alfa Aesar. Silicone oil (PMX-200, 10 cSt) was purchased from MegaChem Ltd. and used as received. Ultrapure water (18.2 MΩ·cm, 25 °C) was obtained from a Milli-Q purifier.

Hamilton glass syringes were filled with silver seeds (0.02 mg/mL), TSC solution (15 mM), AA solution (10 mM), PVA solution (5 wt%), water, silver nitrate solution (6 mM), and silicone oil. The chemical reactants were selected according to the literature45,46. Their loading concentrations were estimated from the concentrations found in the literature, taking into account that a ten times higher AgNP concentration is required to measure the absorption spectrum in a 1 mm optical chamber.

The syringes containing the aqueous phases are all connected to a 9-port PEEK manifold (Idex) via PTFE tubing. The manifold output and the oil syringe are connected to a PEEK T-junction (1 mm through-hole), allowing the controlled generation of monodisperse droplets.

The nanoparticles are synthesized in sub-microliter aqueous droplets (see Figure 1). In such a flow system, the concentration of each reactant is proportional to the flow ratio Qi (%) between the flow rate of that reactant and the total aqueous flow rate. The flow ratios of the silver seeds (Qseed), silver nitrate (\(Q_{\mathrm{AgNO}_{3}}\)), trisodium citrate (QTSC) and PVA (QPVA) are controlled independently by adjusting the flow rate of the corresponding solution and of the solvent (water) with LabView-automated syringe pumps. The flow ratio of ascorbic acid, QAA, is kept constant. The mixing of the reactants within a droplet depends on the droplet velocity, which is proportional to the total flow rate Qtotal (µL/min) of the oil and aqueous phases. The absorption spectra of the droplets are measured in-line; the five control variables Qseed, \(Q_{\mathrm{AgNO}_{3}}\), QTSC, QPVA and Qtotal are used as the inputs of the two-step optimization framework, and the absorption spectrum as its output.

The boundaries of the parameter space are defined by the accessible flow-rate range of each solution. Since the mixing inside the droplets is directly related to the total flow rate in the reaction tube, the sum of all flow rates (Qtotal) is varied between 200 and 1000 µL/min. The total aqueous flow rate is kept equal to the oil flow rate to keep the droplet volume constant. The flow rate of ascorbic acid is always equal to 10% of the oil flow rate. The flow rates of the other aqueous phases can be selected by the ML algorithm within a given percentage of the total aqueous flow rate. Initially, the flow ratios of silver seeds, silver nitrate and TSC are kept between 4% and 20%, and that of the PVA solution between 10% and 40%. In the expanded parameter space, the PVA flow ratio remains between 10% and 40%, while the flow restrictions for silver seeds, silver nitrate and TSC are partially lifted, the only remaining restrictions being: (1) the total flow rate of silver seeds, silver nitrate, TSC and PVA must stay below 90% of the aqueous flow rate, and (2) the flow rates of silver seeds, silver nitrate and TSC must each stay above 0.5% of the total aqueous flow rate.
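These constraints translate directly into a validity check (values in % of the total aqueous flow rate); a minimal sketch:

```python
def is_valid(q_seed: float, q_agno3: float, q_tsc: float, q_pva: float) -> bool:
    reagents = (q_seed, q_agno3, q_tsc)
    return (all(q >= 0.5 for q in reagents)       # (2) each reagent >= 0.5%
            and 10.0 <= q_pva <= 40.0             # PVA stays within 10-40%
            and sum(reagents) + q_pva <= 90.0)    # (1) total <= 90% of aqueous flow
```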

After the T-junction, the droplets flow through a 1.25 m long PFA tube (1 mm inner diameter). One meter after the T-junction, the PFA reaction tube enters a custom-made optical chamber. For each condition suggested by the ML algorithms, droplets are generated until the first droplet leaves the optical chamber. The total flow rate is then reduced to 30 µL/min, and the absorption spectra of 20 consecutive droplets are recorded at 1.4 fps using a spectrometer (Flame-T-UV-Vis, Ocean Optics) combined with a deuterium-halogen light source (DH-2000-BAL, Ocean Optics).

To validate the use of the full absorption spectra in the HTE loop, we measured the size dispersion of the synthesized nanoparticles by TEM imaging (JEM-2100F). Since TEM imaging is time-consuming compared with in-line absorbance measurements, we only imaged the nanoparticles of the best-performing condition of each run. Statistical analysis of the TEM images was performed with MATLAB, using several hundred particles for each condition.

The target spectrum was simulated with DDSCAT47 using the optical constants of silver48. The simulation was performed on a triangular prism 50 nm wide and 10 nm thick, using 13,398 dipoles, averaging the results over two incident polarizations and eight angular orientations of the target around the propagation direction of the incident light. To limit the computation time, the target spectrum was calculated at 25 equidistant wavelengths between 380 and 800 nm.
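Since the measured spectra contain 421 points (380-800 nm at 1 nm) while the simulated target has 25, the target must be resampled onto the measurement grid; linear interpolation, sketched below, is an assumption, as the paper does not state the resampling method:

```python
import numpy as np

sim_wl = np.linspace(380, 800, 25)     # DDSCAT wavelengths (25 points)
meas_wl = np.arange(380, 801, 1)       # spectrometer grid (421 points)
# `target_25` is the simulated absorption spectrum at the 25 DDSCAT wavelengths
target_421 = np.interp(meas_wl, sim_wl, target_25)
```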

The data generated and analyzed during the current study are available in our repository (https://github.com/acceleratedmaterials/AgBONN).

The details of our code implementation are available in the same repository (https://github.com/acceleratedmaterials/AgBONN).

1. Altae-Tran, H., Ramsundar, B., Pappu, A. S. & Pande, V. Low data drug discovery with one-shot learning. ACS Cent. Sci. 3, 283–293 (2017).

2. Chen, H., Engkvist, O., Wang, Y., Olivecrona, M. & Blaschke, T. The rise of deep learning in drug discovery. Drug Discov. Today 23, 1241–1250 (2018).

3. Erickson, B. J., Korfiatis, P., Akkus, Z. & Kline, T. L. Machine learning for medical imaging. RadioGraphics 37, 505–515 (2017).

4. Xie, T. & Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018).

5. Yamawaki, M., Ohnishi, M., Ju, S. & Shiomi, J. Multifunctional structural design of graphene thermoelectrics by Bayesian optimization. Sci. Adv. 4, eaar4192 (2018).

6. Butler, K. T., Davies, D. W., Cartwright, H., Isayev, O. & Walsh, A. Machine learning for molecular and materials science. Nature 559, 547–555 (2018).

7. Olivecrona, M., Blaschke, T., Engkvist, O. & Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminform. 9, 48 (2017).

8. Nash, W., Drummond, T. & Birbilis, N. A review of deep learning in the study of materials degradation. npj Mater. Degrad. 2, 37 (2018).

9. Himanen, L., Geurts, A., Foster, A. S. & Rinke, P. Data-driven materials science: status, challenges, and perspectives. Adv. Sci. 6, 1900808 (2019).

10. Trivedi, V. et al. A modular approach for the generation, storage, mixing, and detection of droplet libraries for high throughput screening. Lab Chip 10, 2433–2442 (2010).

11. Knauer, A. et al. Screening of plasmonic properties of composed metal nanoparticles by combinatorial synthesis in micro-fluid segment sequences. Chem. Eng. J. 227, 80–89 (2013).

12. Lignos, I. et al. Synthesis of cesium lead halide perovskite nanocrystals in a droplet-based microfluidic platform: fast parametric space mapping. Nano Lett. 16, 1869–1877 (2016).

13. Epps, R. W., Felton, K. C., Coley, C. W. & Abolhasani, M. Automated microfluidic platform for systematic studies of colloidal perovskite nanocrystals: towards continuous nano-manufacturing. Lab Chip 17, 4040–4047 (2017).

14. Bezinge, L., Maceiczyk, R. M., Lignos, I., Kovalenko, M. V. & deMello, A. J. Pick a color MARIA: adaptive sampling enables the rapid identification of complex perovskite nanocrystals with defined emission characteristics. ACS Appl. Mater. Interfaces 10, 18869–18878 (2018).

15. Salley, D. et al. A nanomaterials discovery robot for the Darwinian evolution of shape programmable gold nanoparticles. Nat. Commun. 11, 2771 (2020).

16. Shabanzadeh, P., Yusof, R. & Shameli, K. A neural network model for predicting the size of silver nanoparticles in montmorillonite/starch synthesis by chemical reduction. Dig. J. Nanomater. Biostruct. 9, 1699–1711 (2014).

17. Thanh, N. T. K., Maclean, N. & Mahiddine, S. Mechanisms of nucleation and growth of nanoparticles in solution. Chem. Rev. 114, 7610–7630 (2014).

18. Lee, J., Yang, J., Kwon, S. G. & Hyeon, T. Nonclassical nucleation and growth of inorganic nanoparticles. Nat. Rev. Mater. 1, 16034 (2016).

19. Yuan, B. et al. Machine-learning-based monitoring of laser powder bed fusion. Adv. Mater. Technol. 3, 1800136 (2018).

20. Kumar, J. N. et al. Machine learning enables polymer cloud-point engineering via inverse design. npj Comput. Mater. 5, 73 (2019).

21. Voznyy, O. et al. Machine learning accelerates discovery of optimal colloidal quantum dot synthesis. ACS Nano 13, 11122–11128 (2019).

22. Gonzalez, J., Dai, Z., Hennig, P. & Lawrence, N. Batch Bayesian optimization via local penalization. Proc. 19th International Conference on Artificial Intelligence and Statistics 51, 648–657 (2016).

23. Dai, Z. et al. GPyOpt: a Bayesian optimization framework in Python. http://github.com/SheffieldML/GPyOpt (2016).

24. Shuford, K. L., Ratner, M. A. & Schatz, G. C. Multipolar excitation in triangular nanoprisms. J. Chem. Phys. 123, 114713 (2005).

25. Aherne, D., Ledwith, D. M., Gara, M. & Kelly, J. M. Optical properties and growth aspects of silver nanoprisms produced by a highly reproducible and rapid synthesis at room temperature. Adv. Funct. Mater. 18, 2005–2016 (2008).

26. Burger, B. et al. A mobile robotic chemist. Nature 583, 237–241 (2020).

27. Li, Z. et al. Robot-accelerated perovskite investigation and discovery. Chem. Mater. 32, 5650–5663 (2020).

28. Tshitoyan, V. et al. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95–98 (2019).

29. Umehara, M. et al. Analyzing machine learning models to accelerate generation of fundamental materials insights. npj Comput. Mater. 5, 34 (2019).

30. Epps, R. W. et al. Artificial chemist: an autonomous quantum dot synthesis bot. Adv. Mater. 32, 2001626 (2020).

31. Williams, T., McCullough, K. & Lauterbach, J. A. Enabling catalyst discovery through machine learning and high-throughput experimentation. Chem. Mater. 32, 157–165 (2020).

32. Häse, F., Roch, L. M., Kreisbeck, C. & Aspuru-Guzik, A. Phoenics: a Bayesian optimizer for chemistry. ACS Cent. Sci. 4, 1134–1145 (2018).

33. Wang, H. & Li, J. Adaptive Gaussian process approximation for Bayesian inference with expensive likelihood functions. Neural Comput. 30, 3072–3094 (2018).

34. Sakamoto, Y., Ishiguro, M. & Kitagawa, G. Akaike Information Criterion Statistics (D. Reidel, Dordrecht, 1986).

35. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978).

36. Wolpert, D. H. & Macready, W. G. Coevolutionary free lunches. IEEE Trans. Evol. Comput. 9, 721–735 (2005).

37. Rohr, B. et al. Benchmarking the acceleration of materials discovery by sequential learning. Chem. Sci. 11, 2696–2706 (2020).

38. Jones, D. R. A taxonomy of global optimization methods based on response surfaces. J. Glob. Optim. 21, 345–383 (2001).

39. Rasmussen, C. E. Gaussian processes in machine learning. In Advanced Lectures on Machine Learning (Springer, Berlin, Heidelberg, 2004).

40. Wang, Z. L., Ogawa, T. & Adachi, Y. Influence of algorithm parameters of Bayesian optimization, genetic algorithm, and particle swarm optimization on their optimization performance. Adv. Theory Simul. 2, 1900110 (2019).

41. Frazier, P. I. Bayesian optimization. In Recent Advances in Optimization and Modeling of Contemporary Problems, 255–278 (INFORMS, 2018).

42. Stathakis, D. How many hidden layers and nodes? Int. J. Remote Sens. 30, 2133–2147 (2009).

43. Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning: Adaptive Computation and Machine Learning (MIT Press, Cambridge, 2016).

44. Lundberg, S. & Lee, S. I. A unified approach to interpreting model predictions. In Proc. 31st Conference on Neural Information Processing Systems (NIPS, 2017).

45. Singh, A. K. et al. Nonlinear optical properties of triangular silver nanomaterials. Chem. Phys. Lett. 481, 94–98 (2009).

46. Potara, M., Gabudean, A.-M. & Astilean, S. Solution-phase, dual LSPR-SERS plasmonic sensors of high sensitivity and stability based on chitosan-coated anisotropic silver nanoparticles. J. Mater. Chem. 21, 3625–3633 (2011).

47. Draine, B. T. & Flatau, P. J. Discrete-dipole approximation for scattering calculations. J. Opt. Soc. Am. A 11, 1491–1499 (1994).

48. Johnson, P. B. & Christy, R. W. Optical constants of the noble metals. Phys. Rev. B 6, 4370–4379 (1972).

We would like to thank Swee Liang Wong, Lim Yee-Fun, Xu Yang, Jatin Kumar, Liu Xiali and Li Jiali for their equipment support and helpful discussions. This research was supported by the Accelerated Materials Development for Manufacturing Program at A*STAR via the AME Programmatic Fund by the Agency for Science, Technology and Research under Grant No. A1898b0043 (FMB, ZR, TH, WKW, FZ, JX, SJ, ZM, DB, KH, SAK, QL and XW) and by the National Research Foundation of Singapore through the Singapore-MIT Alliance for Research and Technology's Low Energy Electronic Systems (LEES) IRG (ZR, IPST and TB).

These authors contributed equally: Flore Mekki-Berrada, Zekun Ren, Tan Huang.

Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore, Singapore

Flore Mekki-Berrada, Tan Huang, Wai Kuan Wong, Fang Zheng, Jiaxun Xie, Saif Khan & Xiaonan Wang

Singapore-MIT Alliance for Research and Technology (SMART), Singapore, Singapore

Zekun Ren & Isaac Parker Siyu Tian

Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore

Institute of Materials Research and Engineering, Singapore, Singapore

Zackaria Mahfoud, Daniil Bash and Kedar Hippalgaonkar

Department of Materials Science and Engineering, Nanyang Technological University, Singapore

Massachusetts Institute of Technology, Cambridge, Massachusetts, USA

Department of Mathematics, National University of Singapore, Singapore, Singapore

Institute of High Performance Computing, A*STAR, Singapore, Singapore


FMB, ZR, TB, and SAK conceived the research; FMB, WKW, FZ, and JX designed and supervised the autonomous experiments; ZR, TH, IPST, SJ, TB, QL, and XW developed the machine learning algorithms; FMB and DB performed the plasmon resonance simulations; FMB and ZM performed TEM; FMB, ZR, TH, and TB wrote the manuscript; KH, SAK, TB, QL, and XW supervised the research. FMB, ZR and TH are co-first authors of this article.

The author declares no competing interests.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Mekki-Berrada, F., Ren, Z., Huang, T. et al. Two-step machine learning enables optimized nanoparticle synthesis. npj Comput. Mater. 7, 55 (2021). https://doi.org/10.1038/s41524-021-00520-w


